Law of total variance

In probability theory, the law of total variance,[1] also known as Eve's law or the variance decomposition formula, states that if X and Y are random variables on the same probability space, and the variance of Y is finite, then

\operatorname{Var}(Y)=\operatorname{E}(\operatorname{Var}(Y\mid X))+\operatorname{Var}(\operatorname{E}(Y\mid X)).

In language perhaps better known to statisticians than to probabilists, the two terms are the "unexplained" and the "explained" components of the variance (cf. fraction of variance unexplained, explained variation).

The nomenclature in this article's title parallels the phrase law of total probability. Some writers on probability call this the "conditional variance formula" or use other names.

Note that the conditional expected value E( Y | X ) is a random variable in its own right, whose value depends on the value of X. The conditional expected value of Y given the event X = x, by contrast, is a function of x (this is where adherence to the conventional, rigidly case-sensitive notation of probability theory becomes important). If we write E( Y | X = x ) = g(x), then the random variable E( Y | X ) is just g(X). Similar comments apply to the conditional variance.
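Both the construction of E( Y | X ) as g(X) and the decomposition itself can be checked exactly on a small discrete example. The joint distribution below is invented purely for illustration:

```python
# Invented example: X is 0 or 1 with probability 1/2 each; given X = 0,
# Y is 0 or 2 (each with prob. 1/2); given X = 1, Y is 2 or 6.
cond_y = {0: [0.0, 2.0], 1: [2.0, 6.0]}
p_x = {0: 0.5, 1: 0.5}

def mean(vals):
    return sum(vals) / len(vals)

def var(vals):
    m = mean(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

# g(x) = E(Y | X = x) and v(x) = Var(Y | X = x) are ordinary functions of x;
# composing them with X gives the random variables E(Y | X) and Var(Y | X).
g = {x: mean(ys) for x, ys in cond_y.items()}
v = {x: var(ys) for x, ys in cond_y.items()}

# "Unexplained" component: E[Var(Y | X)]
e_condvar = sum(p_x[x] * v[x] for x in p_x)

# "Explained" component: Var(E(Y | X))
e_g = sum(p_x[x] * g[x] for x in p_x)
var_condmean = sum(p_x[x] * (g[x] - e_g) ** 2 for x in p_x)

# Var(Y) computed directly from the joint distribution
ey = sum(p_x[x] * 0.5 * y for x, ys in cond_y.items() for y in ys)
ey2 = sum(p_x[x] * 0.5 * y ** 2 for x, ys in cond_y.items() for y in ys)
var_y = ey2 - ey ** 2

print(var_y, e_condvar + var_condmean)  # both 4.75
```

Here the unexplained component is 2.5, the explained component is 2.25, and their sum equals Var(Y) = 4.75, as the law requires.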

Proof

The law of total variance can be proved using the law of total expectation: First,

\operatorname{Var}[Y] = \operatorname{E}[Y^2] - \operatorname{E}[Y]^2

from the definition of variance. Then we apply the law of total expectation by conditioning on the random variable X:

= \operatorname{E}\left[\operatorname{E}[Y^2|X]\right] - \operatorname{E}\left[\operatorname{E}[Y|X]\right]^2

Now we rewrite the conditional second moment of Y in terms of its variance and first moment:

= \operatorname{E}\!\left[\operatorname{Var}[Y|X] + \operatorname{E}[Y|X]^2\right] - \operatorname{E}[\operatorname{E}[Y|X]]^2

Since expectation of a sum is the sum of expectations, we can now regroup the terms:

= \operatorname{E}[\operatorname{Var}[Y|X]] + \left(\operatorname{E}\left[\operatorname{E}[Y|X]^2\right] - \operatorname{E}\left[\operatorname{E}[Y|X]\right]^2\right)

Finally, we recognize the terms in parentheses as the variance of the conditional expectation E[Y|X]:

= \operatorname{E}\left[\operatorname{Var}[Y|X]\right] + \operatorname{Var}\left[\operatorname{E}[Y|X]\right]
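The identity just derived can also be checked by simulation. Below is a minimal sketch; the model (X uniform on (0, 1), with Y given X normal with mean 2X and variance 1 + X²) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# X ~ Uniform(0, 1); given X, Y is normal with mean 2X and variance 1 + X^2.
x = rng.uniform(0.0, 1.0, n)
cond_mean = 2.0 * x            # E(Y | X), a random variable
cond_var = 1.0 + x ** 2        # Var(Y | X), also a random variable
y = cond_mean + np.sqrt(cond_var) * rng.standard_normal(n)

lhs = np.var(y)                               # Var(Y)
rhs = np.mean(cond_var) + np.var(cond_mean)   # E[Var(Y|X)] + Var(E(Y|X))
print(lhs, rhs)  # both close to the exact value 4/3 + 1/3 = 5/3
```

For this model the two components can be computed in closed form (E[Var(Y|X)] = 4/3 and Var(E(Y|X)) = 1/3), so the Monte Carlo estimates on both sides should agree with 5/3 up to sampling error.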

The square of the correlation

In cases where (Y, X) are such that the conditional expected value is linear; i.e., in cases where

\operatorname{E}(Y \mid X)=aX+b,

it follows from the bilinearity of Cov(-,-) that

a={\operatorname{Cov}(Y,X) \over \operatorname{Var}(X)}

and

b=\operatorname{E}(Y)-{\operatorname{Cov}(Y,X)\over \operatorname{Var}(X)} \operatorname{E}(X)

and the explained component of the variance divided by the total variance is just the square of the correlation between Y and X; i.e., in such cases,

{\operatorname{Var}(\operatorname{E}(Y\mid X)) \over \operatorname{Var}(Y)} = \operatorname{Corr}(Y,X)^2.\,

One example of this situation is when (Y, X) have a bivariate normal (Gaussian) distribution.
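As a sketch of this fact, one can simulate a bivariate normal pair (the parameters below are chosen arbitrarily), form E(Y | X) from the true regression coefficients a and b, and compare the explained-to-total variance ratio with the squared correlation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Arbitrary bivariate normal parameters, for illustration only.
mu_x, mu_y = 1.0, -2.0
sd_x, sd_y, rho = 2.0, 3.0, 0.6
cov = rho * sd_x * sd_y
x, y = rng.multivariate_normal(
    [mu_x, mu_y], [[sd_x**2, cov], [cov, sd_y**2]], n
).T

# True regression line: E(Y | X) = aX + b with a = Cov(Y, X) / Var(X).
a = cov / sd_x**2
b = mu_y - a * mu_x
cond_mean = a * x + b

ratio = np.var(cond_mean) / np.var(y)   # explained / total variance
corr2 = np.corrcoef(x, y)[0, 1] ** 2    # squared correlation
print(ratio, corr2)  # both close to rho^2 = 0.36
```

Both quantities should agree with ρ² = 0.36 up to Monte Carlo error, since for the bivariate normal the conditional expectation is exactly linear.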

Higher moments

A similar law for the third central moment μ3 says

\mu_3(Y)=\operatorname{E}(\mu_3(Y\mid X))+\mu_3(\operatorname{E}(Y\mid X))+3\,\operatorname{Cov}(\operatorname{E}(Y\mid X),\operatorname{Var}(Y\mid X)).
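This identity can be verified exactly on a small discrete distribution (invented for illustration): X is 0 or 1 with equal probability, and each conditional distribution of Y is symmetric, so the conditional third moments vanish and only the covariance term contributes on the right-hand side.

```python
# X in {0, 1} with probability 1/2 each; given X = 0, Y is 0 or 2 (each 1/2);
# given X = 1, Y is 2 or 6. Both conditionals are symmetric, so mu3(Y|X) = 0.
cond_y = {0: [0.0, 2.0], 1: [2.0, 6.0]}

def central_moment(pairs, k):
    # pairs: list of (probability, value)
    m = sum(p * v for p, v in pairs)
    return sum(p * (v - m) ** k for p, v in pairs)

# Conditional mean g(x) and conditional variance v(x) as functions of x.
g = {x: sum(ys) / 2 for x, ys in cond_y.items()}
v = {x: central_moment([(0.5, y) for y in ys], 2) for x, ys in cond_y.items()}

# mu3 of the conditional mean, and Cov(E(Y|X), Var(Y|X)).
mu3_condmean = central_moment([(0.5, g[x]) for x in g], 3)
e_g = sum(0.5 * g[x] for x in g)
e_v = sum(0.5 * v[x] for x in v)
cov_gv = sum(0.5 * (g[x] - e_g) * (v[x] - e_v) for x in g)
rhs = 0.0 + mu3_condmean + 3 * cov_gv   # E[mu3(Y|X)] = 0 here

# mu3(Y) computed directly from the joint distribution.
joint = [(0.25, y) for ys in cond_y.values() for y in ys]
lhs = central_moment(joint, 3)
print(lhs, rhs)  # both 6.75
```

Here μ₃(E(Y|X)) is also 0 by symmetry of g(X), so the entire third central moment of Y comes from the covariance between the conditional mean and the conditional variance.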

For higher cumulants, a simple and elegant generalization exists. See law of total cumulance.

References

  1. ^ Neil A. Weiss, A Course in Probability, Addison–Wesley, 2005, pages 385–386.